[#441] Preload embedding model at startup by sahilds1 · Pull Request #461 · CodeForPhilly/balancer-main

sahilds1 · 2026-02-13T20:51:46Z

Description

Preload SentenceTransformer model at Django startup before traffic is routed to the application instance
Add tests for the embeddings services by pulling apart the core logic to make testing easier

Related Issue

GitHub Issue #441

Manual Tests

sahildshah•~/github/balancer-main(441-embedding-models⚡)» docker compose up --build                                                                                        [15:15:29]

backend-1   | INFO 2026-03-10 19:16:04,973 _client 168 281473584205120 HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/sentence-transformers/paraphrase-MiniLM-L6-v2/c9a2bfebc254878aee8c3aca9e6844d5bbb102d1/config.json "HTTP/1.1 200 OK"
Loading weights: 100%|██████████| 103/103 [00:00<00:00, 4545.93it/s, Materializing param=pooler.dense.weight]

Automated Tests

sahildshah•~/github/balancer-main(441-embedding-models⚡)» docker compose exec backend pytest api/services/test_embedding_services.py -v                                      [15:30:11]
================================================================================== test session starts ==================================================================================
platform linux -- Python 3.11.4, pytest-9.0.2, pluggy-1.6.0 -- /usr/local/bin/python
cachedir: .pytest_cache
django: version: 4.2.3, settings: balancer_backend.settings (from ini)
rootdir: /usr/src/server
configfile: pytest.ini
plugins: django-4.11.1, anyio-4.12.1
collected 4 items

api/services/test_embedding_services.py::test_evaluate_query_maps_fields PASSED                                                                                                   [ 25%]
api/services/test_embedding_services.py::test_evaluate_query_none_upload_file PASSED                                                                                              [ 50%]
api/services/test_embedding_services.py::test_log_usage_computes_distance_stats PASSED                                                                                            [ 75%]
api/services/test_embedding_services.py::test_log_usage_swallows_exceptions PASSED                                                                                                [100%]

=================================================================================== 4 passed in 0.03s ===================================================================================

Documentation

Updated README with instructions for running backend tests

Reviewers

@taichan03 @amahuli03

Notes

… routed to the application instance

Copilot

Pull request overview

This PR aims to ensure the SentenceTransformer embedding model is loaded during Django startup (before traffic hits the instance) and to make the embeddings search logic more testable by factoring it into smaller functions.

Changes:

Refactors get_closest_embeddings by extracting query building, evaluation, and usage logging into helper functions.
Adds pytest + pytest-django support (requirements + pytest.ini) and new unit tests for embedding service helpers.
Updates GitHub Actions workflow and README to run backend tests.
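
As a rough illustration of the helper split listed above, the sketch below shows what an `evaluate_query` / `log_usage` pair could look like. The function names and the result keys come from this PR; the signatures, the `record` callback, and the stats field names are assumptions made for illustration, not the PR's actual code.

```python
import logging
from statistics import mean
from types import SimpleNamespace

logger = logging.getLogger(__name__)


def evaluate_query(rows):
    """Map ORM-like rows to plain dicts (hypothetical signature)."""
    return [
        {
            "name": row.name,
            "text": row.text,
            "page_number": row.page_number,
            "chunk_number": row.chunk_number,
            "distance": row.distance,
            # upload_file may be None, so guard the attribute access
            "file_id": row.upload_file.guid if row.upload_file else None,
        }
        for row in rows
    ]


def log_usage(results, record):
    """Compute distance stats and pass them to `record`; never raise."""
    try:
        distances = [r["distance"] for r in results]
        stats = {
            "num_results": len(distances),
            "min_distance": min(distances) if distances else None,
            "max_distance": max(distances) if distances else None,
            "mean_distance": mean(distances) if distances else None,
        }
        record(stats)
    except Exception:
        # Usage logging is best-effort and must not break the search itself
        logger.exception("Failed to log embedding usage")


# Stand-in rows; a real caller would pass an evaluated QuerySet instead
demo_rows = [
    SimpleNamespace(name="a.pdf", text="x", page_number=1, chunk_number=0,
                    distance=0.2, upload_file=SimpleNamespace(guid="g-1")),
    SimpleNamespace(name="b.pdf", text="y", page_number=2, chunk_number=3,
                    distance=0.4, upload_file=None),
]
demo_results = evaluate_query(demo_rows)
demo_stats = []
log_usage(demo_results, demo_stats.append)
```

Keeping both helpers free of ORM calls is what lets tests such as `test_evaluate_query_none_upload_file` and `test_log_usage_swallows_exceptions` run against simple stand-in objects.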

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.


| File | Description |
| --- | --- |
| `server/requirements.txt` | Adds pytest dependencies to support running backend tests. |
| `server/pytest.ini` | Configures pytest-django settings/module and python path for the server package. |
| `server/api/services/test_embedding_services.py` | Adds unit tests for query evaluation and usage logging helpers. |
| `server/api/services/embedding_services.py` | Refactors embeddings search into `build_query`, `evaluate_query`, `log_usage`, and reworks `get_closest_embeddings`. |
| `server/api/apps.py` | Attempts to preload the embedding model during Django app initialization via `ready()`. |
| `README.md` | Documents how to run backend tests inside the backend container. |
| `.github/workflows/python-app.yml` | Changes CI branch targets and adds dependency install + pytest execution. |


Copilot · 2026-03-11T17:56:02Z

server/api/services/embedding_services.py

@@ -32,59 +31,52 @@ def get_closest_embeddings(

    Returns
    -------
-    list[dict]
-        List of dictionaries containing embedding results with keys:
-        - name: document name
-        - text: embedded text content
-        - page_number: page number in source document
-        - chunk_number: chunk number within the document
-        - distance: L2 distance from query embedding
-        - file_id: GUID of the source file
+    QuerySet
+        Unevaluated Django QuerySet ordered by L2 distance, sliced to num_results
    """
-
-    encoding_start = time.time()
-    transformerModel = TransformerModel.get_instance().model
-    embedding_message = transformerModel.encode(message_data)
-    encoding_time = time.time() - encoding_start
-
-    db_query_start = time.time()
-
    # Django QuerySets are lazily evaluated
    if user.is_authenticated:
        # User sees their own files + files uploaded by superusers
-        closest_embeddings_query = (
-            Embeddings.objects.filter(
-                Q(upload_file__uploaded_by=user) | Q(upload_file__uploaded_by__is_superuser=True)
-            )
-            .annotate(
-                distance=L2Distance("embedding_sentence_transformers", embedding_message)
-            )
-            .order_by("distance")
+        queryset = Embeddings.objects.filter(
+            Q(upload_file__uploaded_by=user) | Q(upload_file__uploaded_by__is_superuser=True)
        )
    else:
        # Unauthenticated users only see superuser-uploaded files
-        closest_embeddings_query = (
-            Embeddings.objects.filter(upload_file__uploaded_by__is_superuser=True)
-            .annotate(
-                distance=L2Distance("embedding_sentence_transformers", embedding_message)
-            )
-            .order_by("distance")
-        )
+        queryset = Embeddings.objects.filter(upload_file__uploaded_by__is_superuser=True)
+
+    queryset = (
+        queryset
+        .annotate(distance=L2Distance("embedding_sentence_transformers", embedding_vector))
+        .order_by("distance")
+    )

    # Filtering to a document GUID takes precedence over a document name
    if guid:
-        closest_embeddings_query = closest_embeddings_query.filter(
-            upload_file__guid=guid
-        )
+        queryset = queryset.filter(upload_file__guid=guid)
    elif document_name:
-        closest_embeddings_query = closest_embeddings_query.filter(name=document_name)
+        queryset = queryset.filter(name=document_name)

    # Slicing is equivalent to SQL's LIMIT clause
-    closest_embeddings_query = closest_embeddings_query[:num_results]
+    return queryset[:num_results]


build_query() introduces/relocates important filtering + precedence logic (authenticated vs unauthenticated visibility; guid-over-document_name; LIMIT slicing), but the new tests only cover evaluate_query and log_usage. Add unit/integration tests covering build_query behavior (e.g., guid precedence and the authenticated/unauthenticated queryset filters) to prevent regressions in access control and filtering.
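
One way such a test could be approached without a database is sketched below. It assumes a recording stand-in for the QuerySet and a hypothetical helper, `apply_document_filter`, that mirrors only the guid-over-document_name branch and the final slice from the diff; neither name exists in the PR, and a real test of `build_query` would also need to cover the authenticated and unauthenticated visibility filters.

```python
class RecordingQuerySet:
    """Minimal stand-in for a Django QuerySet that records chained calls."""

    def __init__(self, calls=None):
        self.calls = list(calls or [])

    def filter(self, *args, **kwargs):
        # Return a new object, as QuerySet.filter does, and note the kwargs
        return RecordingQuerySet(self.calls + [("filter", kwargs)])

    def __getitem__(self, item):
        # Only slices of the form [:n] are expected here
        return RecordingQuerySet(self.calls + [("slice", item.stop)])


def apply_document_filter(queryset, guid=None, document_name=None, num_results=10):
    """Hypothetical mirror of the precedence branch shown in the diff."""
    if guid:
        queryset = queryset.filter(upload_file__guid=guid)
    elif document_name:
        queryset = queryset.filter(name=document_name)
    return queryset[:num_results]


def test_guid_takes_precedence_over_document_name():
    result = apply_document_filter(
        RecordingQuerySet(), guid="abc", document_name="doc.pdf", num_results=5
    )
    assert result.calls == [("filter", {"upload_file__guid": "abc"}), ("slice", 5)]


def test_document_name_used_when_guid_missing():
    result = apply_document_filter(
        RecordingQuerySet(), document_name="doc.pdf", num_results=5
    )
    assert result.calls == [("filter", {"name": "doc.pdf"}), ("slice", 5)]


def test_no_document_filter_when_both_missing():
    result = apply_document_filter(RecordingQuerySet(), num_results=3)
    assert result.calls == [("slice", 3)]
```

The access-control filters are better exercised against a test database with pytest-django, since they depend on joins across `upload_file__uploaded_by`.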

Copilot · 2026-03-11T17:56:02Z

server/api/apps.py

+    def ready(self):
+        from .services.sentencetTransformer_model import TransformerModel
+        TransformerModel.get_instance()


ApiConfig.ready() will only run if this AppConfig is actually used by Django. Right now INSTALLED_APPS appears to include just "api" (not "api.apps.ApiConfig"), and api/__init__.py doesn’t set a default config, so this preload hook may never execute. Consider updating INSTALLED_APPS to reference api.apps.ApiConfig (or otherwise ensuring this config is selected) so the model is preloaded as intended.
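
The explicit form suggested here would look roughly like the settings fragment below. Only the `api` entry comes from the comment; the other entries are placeholders. Note that Django 3.2 and later normally auto-select an app's single `AppConfig` subclass from `apps.py`, so with Django 4.2.3 (the version shown in the test output) the plain `"api"` entry may already pick up `ApiConfig`.

```python
# settings.py fragment (sketch; entries other than the api line are placeholders)
INSTALLED_APPS = [
    "django.contrib.contenttypes",
    "django.contrib.auth",
    # Explicit dotted path to the config class instead of the bare "api" label
    "api.apps.ApiConfig",
]
```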

Copilot · 2026-03-11T17:56:02Z

server/api/apps.py

+    def ready(self):
+        from .services.sentencetTransformer_model import TransformerModel
+        TransformerModel.get_instance()


Calling TransformerModel.get_instance() unconditionally in ready() will run for every Django startup context (tests, migrations, management commands, autoreload) and can trigger a large model download/init even when no web traffic will be served. Consider gating this preload behind an explicit env flag (or limiting it to the web server entrypoint) to avoid slowing/fragilizing CI and one-off management commands.
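
A minimal sketch of the env-flag gating idea follows. The flag name `PRELOAD_EMBEDDING_MODEL`, the helper `should_preload_model`, and the `ApiConfigSketch` class are all hypothetical; the class is a plain stand-in rather than a Django `AppConfig`, so the snippet runs without Django installed.

```python
import os


def should_preload_model(environ=os.environ):
    """Return True only when the (hypothetical) PRELOAD_EMBEDDING_MODEL flag is set."""
    value = environ.get("PRELOAD_EMBEDDING_MODEL", "")
    return value.strip().lower() in {"1", "true", "yes"}


class ApiConfigSketch:
    """Stand-in for the app config; `loader` plays the role of get_instance."""

    def __init__(self, loader, environ=os.environ):
        self._loader = loader
        self._environ = environ

    def ready(self):
        # Skip the expensive load for tests, migrations, and one-off commands
        if should_preload_model(self._environ):
            self._loader()


load_calls = []

# Flag unset: ready() does nothing
ApiConfigSketch(lambda: load_calls.append("loaded"), environ={}).ready()

# Flag set: ready() triggers the loader once
ApiConfigSketch(
    lambda: load_calls.append("loaded"),
    environ={"PRELOAD_EMBEDDING_MODEL": "true"},
).ready()
```

In the real `ready()` the loader would be `TransformerModel.get_instance`, and the flag would be set only in the web server's environment.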

Copilot · 2026-03-11T17:56:03Z

.github/workflows/python-app.yml

+      run: pip install -r server/requirements.txt
+      # Discover and run all files matching test_*.py or *_test.py under server/
+    - name: Run tests
+      run: pytest server/ -v


pytest.ini was added under server/, but this workflow runs pytest server/ -v from the repo root. Pytest won’t automatically discover config files in subdirectories, so DJANGO_SETTINGS_MODULE/pythonpath may not be applied and Django tests can fail to initialize. Fix by either cd server && pytest -v or running pytest -c server/pytest.ini server/ -v (or setting DJANGO_SETTINGS_MODULE in the workflow env).

Suggested change

-      run: pytest server/ -v
+      run: pytest -c server/pytest.ini server/ -v

REFACTOR Pull apart get_closest_embeddings to make testing easier

59b40f0

sahilds1 changed the title ~~REFACTOR Pull apart get_closest_embeddings to make testing easier~~ [DRAFT] [#441] Embedding Models Feb 13, 2026

sahilds1 self-assigned this Feb 13, 2026

sahilds1 added 2 commits February 13, 2026 16:12

ADD Add infra required to run pytest

3ffb74a

ADD Start adding tests for embedding_services"

12b09a7

sahilds1 changed the title ~~[DRAFT] [#441] Embedding Models~~ [WIP] [#441] Embedding Models Feb 13, 2026

sahilds1 added 2 commits February 17, 2026 14:40

DOC Add a note about running pytest in the README

da9afaa

Preload SentenceTransformer model at Django startup before traffic is…

5ce7782

… routed to the application instance

sahilds1 changed the title ~~[WIP] [#441] Embedding Models~~ [#441] Preload embedding model at startup Feb 27, 2026

sahilds1 requested review from amahuli03 and taichan03 March 10, 2026 19:34

sahilds1 marked this pull request as ready for review March 10, 2026 19:37

sahilds1 added 2 commits March 11, 2026 13:06

Merge branch 'develop' into 441-embedding-models

50a8bd3

Run python-app workflow on pushes and PRs to develop branch

795f218

sahilds1 requested a review from Copilot March 11, 2026 17:51

Copilot started reviewing on behalf of sahilds1 March 11, 2026 17:52

Copilot AI reviewed Mar 11, 2026

sahilds1 wants to merge 7 commits into CodeForPhilly:develop from sahilds1:441-embedding-models


2 participants
